Binary Codes: From Symbols to Binary Codes*


Louis Scharf 

This work is produced by The Connexions Project and licensed under the
Creative Commons Attribution License.


NOTE: This module is part of the collection, A First Course in Electrical and Computer Engineering.
The LaTeX source files for this collection were created using an optical character recognition
technology, and because of this process there may be more errors than usual. Please contact us if
you discover any errors.

Perhaps the most fundamental idea in communication theory is that arbitrary symbols may be represented
by strings of binary digits. These strings are called binary words, binary addresses, or binary codes. In
the simplest of cases, a finite alphabet consisting of the letters or symbols $s_0, s_1, \ldots, s_{M-1}$ is
represented by $M$ binary codes. The obvious way to implement the representation is to let the $i$th binary
code be the binary representation for the subscript $i$:


$$
\begin{array}{l}
s_0 \sim 000 = a_0 \\
s_1 \sim 001 = a_1 \\
\quad\vdots \\
s_6 \sim 110 = a_6 \\
s_7 \sim 111 = a_7.
\end{array}
$$

The number of bits required for the binary code is $N$ where

$$2^{N-1} < M \le 2^N. \qquad (2)$$

We say, roughly, that $N = \log_2 M$.
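As a minimal sketch of this indexing scheme, the following Python computes $N$ from $M$ using inequality (2) and forms the binary code for a subscript $i$. The function names `bits_needed` and `index_code` are hypothetical, chosen only for illustration.

```python
def bits_needed(m: int) -> int:
    """Smallest N with 2**(N-1) < m <= 2**N, for an alphabet of m >= 2 symbols."""
    if m < 2:
        raise ValueError("need at least two symbols")
    return (m - 1).bit_length()


def index_code(i: int, m: int) -> str:
    """Binary code for symbol s_i: the subscript i written with N bits."""
    if not 0 <= i < m:
        raise ValueError("subscript out of range")
    return format(i, "0{}b".format(bits_needed(m)))


print(index_code(6, 8))  # 110
print(bits_needed(5))    # 3
```

An alphabet of five symbols still needs three bits, since $2^2 < 5 \le 2^3$.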

*Version 1.7: Sep 16, 2009 12:59 pm GMT-5

Octal Codes. When the number of symbols is large and the corresponding binary codes contain many
bits, then we typically group the bits into groups of three and replace the binary code by its corresponding
octal code. For example, a seven-bit binary code maps into a three-digit octal code as follows:


$$
\begin{array}{rcl}
0000000 & \sim & 000 \\
0000001 & \sim & 001 \\
0100110 & \sim & 046 \\
1011111 & \sim & 137 \\
1111111 & \sim & 177.
\end{array}
$$
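The grouping into threes can be sketched in a few lines of Python. The helper name `binary_to_octal` is hypothetical. The bits are grouped from the right, so a seven-bit code is padded on the left with zeros to nine bits before conversion.

```python
def binary_to_octal(bits: str) -> str:
    """Group bits in threes from the right and map each group to one octal digit."""
    width = -(-len(bits) // 3) * 3      # length rounded up to a multiple of 3
    padded = bits.zfill(width)          # pad on the left with zeros
    groups = [padded[k:k + 3] for k in range(0, width, 3)]
    return "".join(str(int(group, 2)) for group in groups)


print(binary_to_octal("0100110"))  # 046
print(binary_to_octal("1111111"))  # 177
```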


The octal ASCII codes for representing letters, numbers, and special characters are tabulated in Table 1.

Exercise 1 

Write out the seven-bit ASCII codes for A, g, 7, and {.



        '0    '1    '2    '3    '4    '5    '6    '7
'00x    NUL   SOH   STX   ETX   EOT   ENQ   ACK   BEL
'01x    BS    HT    LF    VT    FF    CR    SO    SI
'02x    DLE   DC1   DC2   DC3   DC4   NAK   SYN   ETB
'03x    CAN   EM    SUB   ESC   FS    GS    RS    US
'04x    SP    !     "     #     $     %     &     '
'05x    (     )     *     +     ,     -     .     /
'06x    0     1     2     3     4     5     6     7
'07x    8     9     :     ;     <     =     >     ?
'10x    @     A     B     C     D     E     F     G
'11x    H     I     J     K     L     M     N     O
'12x    P     Q     R     S     T     U     V     W
'13x    X     Y     Z     [     \     ]     ^     _
'14x    `     a     b     c     d     e     f     g
'15x    h     i     j     k     l     m     n     o
'16x    p     q     r     s     t     u     v     w
'17x    x     y     z     {     |     }     ~     DEL


Table 1: Octal ASCII Codes (from Donald E. Knuth, The TeXbook, ©1986 by the American
Mathematical Society, Providence, Rhode Island, p. 367, published by Addison-Wesley Publishing Co.)


Exercise 2 

Add a 1 or a 0 to the most significant (left-most) position of the seven-bit ASCII code to produce
an eight-bit code that has even parity (even number of 1’s). Give the resulting eight-bit ASCII
codes and the corresponding three-digit octal codes for %, u, /, 8, and +.
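The parity construction can be sketched as follows, using the characters A and C (which are not among those asked for in the exercise) as illustrations. The function name `even_parity_code` is hypothetical; the sketch assumes Python's `ord` returns the ASCII code for these characters, which holds for all seven-bit ASCII characters.

```python
def even_parity_code(ch: str) -> tuple:
    """Return (eight-bit even-parity code, three-digit octal code) for one character."""
    value = ord(ch)
    if value > 0o177:
        raise ValueError("not a seven-bit ASCII character")
    seven = format(value, "07b")
    parity_bit = seven.count("1") % 2       # 1 only when the count of 1's is odd
    eight = str(parity_bit) + seven         # parity bit goes in the left-most position
    return eight, format(int(eight, 2), "03o")


print(even_parity_code("A"))  # ('01000001', '101')
print(even_parity_code("C"))  # ('11000011', '303')
```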
Quantizers and A/D Converters. What if the source alphabet is infinite? Our only hope is to approximate
it with a finite collection of finite binary words. For example, suppose the output of the source is
an analog voltage that lies between $-V_0$ and $+V_0$. We might break this peak-to-peak range up into little
voltage cells of size $\frac{2V_0}{M}$ and approximate the voltage in each cell by its midpoint. This scheme is
illustrated in Figure 1. In the figure, the cell $C_i$ is defined to be the set of voltages that fall between
$i\frac{2V_0}{M} - \frac{V_0}{M}$ and $i\frac{2V_0}{M} + \frac{V_0}{M}$:

$$C_i = \left\{ V : i\frac{2V_0}{M} - \frac{V_0}{M} < V \le i\frac{2V_0}{M} + \frac{V_0}{M} \right\}. \qquad (4)$$

The mapping from continuous values of $V$ to a finite set of approximations is

$$Q(V) = i\frac{2V_0}{M}, \quad \text{if } V \in C_i. \qquad (5)$$

That is, $V$ is replaced by the quantized approximation $i\frac{2V_0}{M}$ whenever $V$ lies in cell $C_i$. We may
represent the quantized values $i\frac{2V_0}{M}$ with binary codes by simply representing the subscript of the cell
by a binary word. In a subsequent course on digital electronics and microprocessors you will study A/D
(analog-to-digital) converters for quantizing variables.
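A minimal sketch of the quantizer mapping, under stated assumptions: cells have width $2V_0/M$ and are centered on the levels $i \cdot 2V_0/M$, and the upper edge of each cell is taken to belong to that cell (this edge convention is an assumption). The function name `quantize` is hypothetical, and inputs outside the range $\pm V_0$ are not clipped.

```python
import math


def quantize(v: float, v0: float, m: int) -> tuple:
    """Return (cell index i, quantized level i * 2*v0/m) for a voltage v."""
    step = 2.0 * v0 / m                  # cell width 2*V0/M
    i = math.ceil(v / step - 0.5)        # index of the cell whose midpoint is nearest
    return i, i * step


# With V0 = 4 and M = 8 the cell width is 1, so the levels are the integers.
print(quantize(0.6, 4.0, 8))    # (1, 1.0)
print(quantize(-2.2, 4.0, 8))   # (-2, -2.0)
```

The returned index is the subscript that would then be represented by a binary word.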


Figure 1: A Quantizer


Example 1 

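If $M = 8$, corresponding to a three-bit quantizer, we may associate quantizer cells and quantized
levels with binary codes as follows:

$$
\begin{array}{lcl}
V \in C_{-3} & \Rightarrow & V_{-3} = (-3)\frac{2V_0}{8} \sim 111 \\
V \in C_{-2} & \Rightarrow & V_{-2} = (-2)\frac{2V_0}{8} \sim 110 \\
V \in C_{-1} & \Rightarrow & V_{-1} = (-1)\frac{2V_0}{8} \sim 101 \\
V \in C_{0} & \Rightarrow & V_{0} = 0 \sim 000 \\
V \in C_{1} & \Rightarrow & V_{1} = (1)\frac{2V_0}{8} \sim 001 \\
V \in C_{2} & \Rightarrow & V_{2} = (2)\frac{2V_0}{8} \sim 010 \\
V \in C_{3} & \Rightarrow & V_{3} = (3)\frac{2V_0}{8} \sim 011.
\end{array} \qquad (6)
$$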



This particular code is called a sign-magnitude code, wherein the most significant bit is a sign bit
and the remaining bits are magnitude bits (e.g., $110 \sim -2$ and $010 \sim 2$). One of the defects of the
sign-magnitude code is that it wastes one code by using 000 for 0 and 100 for $-0$. An alternative
code that has many other advantages is the 2’s complement code. The 2’s complement codes for
positive numbers are the same as the sign-magnitude codes, but the codes for negative numbers are
generated by complementing all bits for the corresponding positive number and adding 1:


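$$
\begin{array}{rcll}
-4 & \sim & 100 & \\
-3 & \sim & 101 & (100 + 1) \\
-2 & \sim & 110 & (101 + 1) \\
-1 & \sim & 111 & (110 + 1) \\
0 & \sim & 000 & \\
1 & \sim & 001 & \\
2 & \sim & 010 & \\
3 & \sim & 011. &
\end{array}
$$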
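The complement-and-add-1 rule can be sketched in Python as follows. The function name `twos_complement` is hypothetical. The sketch follows the rule as stated: write the positive number in binary, complement every bit, and add 1, keeping only the lowest bits.

```python
def twos_complement(n: int, bits: int) -> str:
    """Return the 2's complement code of n using the given number of bits."""
    low, high = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    if not low <= n <= high:
        raise ValueError("number does not fit in the given number of bits")
    if n >= 0:
        return format(n, "0{}b".format(bits))
    positive = format(-n, "0{}b".format(bits))
    flipped = "".join("1" if b == "0" else "0" for b in positive)
    # Add 1 and keep only the lowest `bits` bits.
    return format((int(flipped, 2) + 1) % (1 << bits), "0{}b".format(bits))


for n in range(-4, 4):
    print(n, twos_complement(n, 3))
```

For three bits this reproduces the eight codes listed above, including 100 for $-4$, which has no positive counterpart.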





Exercise 3 

Generate the four-bit sign-magnitude and four-bit 2’s complement binary codes for the numbers 
-8,-7,...,-1,0,1,2,...,7. 

Exercise 4 

Prove that, in the 2’s complement representation, the binary codes for $-n$ and $+n$ sum to zero.
For example,

$$
\begin{array}{ccccc}
101 & + & 011 & = & 000 \\
(-3) & & (3) & & (0).
\end{array} \qquad (8)
$$

In your courses on computer arithmetic you will learn how to do arithmetic in various binary-coded systems. 
The following problem illustrates how easy arithmetic is in 2’s complement. 

Exercise 5 

Generate a table of sums for all 2’s complement numbers between $-4$ and $+3$. Show that the sums
are correct. Use 0 + 0 = 0, 0 + 1 = 1, 1 + 0 = 1, and 1 + 1 = 0 with a carry into the next bit. For
example, 001 + 001 = 010.
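The bitwise rule above amounts to adding the codes as unsigned binary numbers and discarding any carry out of the most significant bit. A minimal sketch, with hypothetical helper names `add_codes` and `to_signed`:

```python
def add_codes(a: str, b: str) -> str:
    """Add two equal-length binary codes, discarding the carry out of the top bit."""
    if len(a) != len(b):
        raise ValueError("codes must have the same length")
    n = len(a)
    total = (int(a, 2) + int(b, 2)) % (1 << n)
    return format(total, "0{}b".format(n))


def to_signed(code: str) -> int:
    """Interpret a binary code as a 2's complement number."""
    value = int(code, 2)
    return value - (1 << len(code)) if code[0] == "1" else value


print(add_codes("001", "001"))             # 010
print(add_codes("101", "011"))             # 000
print(to_signed(add_codes("110", "001")))  # -1
```

Sums that fall outside the range $-4$ to $+3$ wrap around, so a three-bit result is correct only modulo 8.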

Binary Trees and Variable-Length Codes. The codes we have constructed so far are constant-length
codes for finite alphabets that contain exactly $M = 2^N$ symbols. In the case where $M = 8$ and $N = 3$,
the eight possible three-bit codes may be represented as leaves on the branching tree illustrated in
Figure 2(a). The tree grows a left branch for a 0 and a right branch for a 1, until it terminates after three
branchings. The three-bit codes we have studied so far reside at the terminating leaves of the binary tree.
But what if our source alphabet contains just five symbols or letters? We can represent these five symbols
as the three-bit symbols 000 through 100 on the binary tree. This generates a constant-length code with
three unused, or illegal, symbols 101 through 111. These are marked with an ‘x’ in Figure 2(a).
These unused leaves and the branches leading to them may be pruned to produce the binary tree of
Figure 2(b).

If we admit variable-length codes, then we have several other options for using a binary tree to construct
binary codes. Two of these codes and their corresponding binary trees are illustrated in Figure 3.
If we disabuse ourselves of the notion that each code word must contain three or fewer bits, then we may
construct binary trees like those of Figure 4 and generate their corresponding binary codes. In
Figure 4(a), we grow a right branch after each left branch and label each leaf with a code word.
In Figure 4(b), we prune off the last right branch and associate a code word with the leaf on the
last left branch.




Figure 2: Binary Trees and Constant-Length Codes; (a) Binary Tree, and (b) Pruned Binary Tree 


All of the codes we have generated so far are organized in Table 2. For each code, the average
number of bits/symbol is tabulated. This average ranges from 2.4 to 3.0. If all symbols are equally likely to
appear, then the best variable-length code would be code 2.



Figure 3: Binary Trees and Variable-Length Codes; (a) Binary Tree for Variable-length Code, and (b) 
Another Binary Tree for Variable-length Code 


All of the codes we have constructed have a common characteristic: each code word is a terminating leaf
on a binary tree, meaning that no code word lies along a limb of branches to another code word. We say
that no code word is a prefix to another code word. This property makes each of the codes instantaneously
decodable, meaning that each bit in a string of bits may be processed instantaneously (or independently)
without dependence on subsequent bits.
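The prefix property and the decoding it permits can be sketched as follows. The names `is_prefix_free`, `decode`, and `CODE_2` are hypothetical; the code words are those of code 2 in Table 2, and the bit string is a short illustration rather than the one used in the exercises. The decoder emits a symbol as soon as the bits read so far match a code word, without looking ahead.

```python
def is_prefix_free(codewords) -> bool:
    """True if no code word is a prefix of (or equal to) another code word."""
    words = list(codewords)
    for i, a in enumerate(words):
        for j, b in enumerate(words):
            if i != j and b.startswith(a):
                return False
    return True


def decode(bits: str, code: dict) -> list:
    """Decode a bit string one bit at a time; `code` maps symbol -> code word."""
    lookup = {word: symbol for symbol, word in code.items()}
    symbols, current = [], ""
    for bit in bits:
        current += bit
        if current in lookup:            # a leaf of the tree has been reached
            symbols.append(lookup[current])
            current = ""
    if current:
        raise ValueError("trailing bits do not form a code word")
    return symbols


CODE_2 = {"S0": "000", "S1": "001", "S2": "01", "S3": "10", "S4": "11"}
print(is_prefix_free(CODE_2.values()))  # True
print(decode("0110000", CODE_2))        # ['S2', 'S3', 'S0']
```

This greedy decoder is only guaranteed to be correct when `is_prefix_free` holds for the code.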

Exercise 6 

Decode the following sequence of bits using code 2: 

0111001111000000101100111. (9) 

Figure 4: Left-Handed Binary Trees for Variable-Length Codes; (a) Left-handed Binary Tree, and (b) 
Pruned Binary Tree 


Code #   S_0   S_1   S_2   S_3    S_4     Average Bits/Symbol
1        000   001   010   011    100     15/5 = 3.0
2        000   001   01    10     11      12/5 = 2.4
3        000   001   010   011    1       13/5 = 2.6
4        1     01    001   0001   00001   15/5 = 3.0
5        1     01    001   0001   0000    14/5 = 2.8


Table 2: Variable Length Codes 


Exercise 7 

Illustrate the following codes on a binary tree. Which of them are instantaneously decodable? 
Which can be pruned and remain instantaneously decodable? 


S_0   S_1   S_2   S_3   S_4
011   100   00    11    101
011   100   00    0     01
010   000   100   101   111.     (10)


Code #2 generated in Table 2 seems like a better code than code #5 because its average number of
bits/symbol (2.4) is smaller. But what if symbol $S_0$ is a very likely symbol and symbol $S_4$ is a very unlikely
one? Then it may well turn out that the average number of bits used by code #5 is less than the average
number used by code #2. So what is the best code? The answer depends on the relative frequency of use
for each symbol. We explore this question in the next section.
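To make the comparison concrete, the following sketch computes the average number of bits/symbol as the probability-weighted sum of code word lengths. The function name `average_length` and the skewed probabilities are hypothetical, chosen only so that $S_0$ is likely and $S_4$ is unlikely; the code words are those of codes 2 and 5 in Table 2.

```python
def average_length(codewords, probabilities) -> float:
    """Expected bits per symbol: sum over symbols of p_i times the length of code word i."""
    if len(codewords) != len(probabilities):
        raise ValueError("need one probability per code word")
    return sum(p * len(word) for word, p in zip(codewords, probabilities))


CODE_2 = ["000", "001", "01", "10", "11"]      # code 2 of Table 2
CODE_5 = ["1", "01", "001", "0001", "0000"]    # code 5 of Table 2

uniform = [0.2] * 5                            # all symbols equally likely
skewed = [0.6, 0.2, 0.1, 0.05, 0.05]           # hypothetical: S_0 likely, S_4 rare

for name, probs in (("uniform", uniform), ("skewed", skewed)):
    print(name,
          round(average_length(CODE_2, probs), 3),
          round(average_length(CODE_5, probs), 3))
```

With equally likely symbols the averages are about 2.4 for code 2 and 2.8 for code 5, matching Table 2. Under the assumed skewed probabilities they are about 2.8 and 1.7, so code 5 uses fewer bits on average.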